Speech Bandwidth Extension Using Articulatory Features

Author

  • Mark Hasegawa-Johnson
Abstract

In this paper, we present a technique for bandwidth extension (BWE) of a narrowband (0–4 kHz) signal using articulatory features. The proposed technique recovers high-band components (4–8 kHz) through Gaussian mixture regression (GMR) on both acoustic and articulatory features from the X-ray Microbeam (XRMB) speech production database. The Gaussian mixture model (GMM) over the acoustic and articulatory features is initialized using k-means and trained iteratively using the expectation-maximization (EM) algorithm. BWE experiments were run using data files from different speakers in XRMB as training and test data. Time-frequency plots of speech recovered by different training methods are presented to show that articulatory trajectories help characterize high-frequency consonants in speech. Finally, we confirm our hypothesis that a GMM with articulatory features gives a better recovery rate by performing Student's t-test on the SNR between original and recovered speech.
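
The regression step named in the abstract can be illustrated with a minimal sketch. It assumes a joint GMM over z = [x, y] has already been trained (for example with k-means initialization followed by EM), where x stands for the narrowband acoustic and articulatory features and y for the high-band features. The function names `gaussian_pdf` and `gmr_predict` and the parameter layout are hypothetical, chosen for illustration only; this is not the paper's implementation.

```python
import numpy as np


def gaussian_pdf(x, mean, cov):
    """Density of a multivariate Gaussian N(mean, cov) evaluated at x."""
    d = x.shape[0]
    diff = x - mean
    solved = np.linalg.solve(cov, diff)
    norm = np.sqrt(((2.0 * np.pi) ** d) * np.linalg.det(cov))
    return float(np.exp(-0.5 * diff @ solved) / norm)


def gmr_predict(x, weights, means, covs, dx):
    """Estimate E[y | x] under a joint GMM over z = [x, y].

    x       : (dx,) observed input features
    weights : (K,) mixture weights
    means   : (K, dx + dy) joint component means
    covs    : (K, dx + dy, dx + dy) joint component covariances
    dx      : number of input dimensions at the front of z
    """
    K = len(weights)
    resp = np.empty(K)
    cond_means = np.empty((K, means.shape[1] - dx))
    for k in range(K):
        mu_x, mu_y = means[k, :dx], means[k, dx:]
        s_xx = covs[k, :dx, :dx]
        s_yx = covs[k, dx:, :dx]
        # Responsibility of component k given only the observed part x.
        resp[k] = weights[k] * gaussian_pdf(x, mu_x, s_xx)
        # Per-component linear regression of y on x.
        cond_means[k] = mu_y + s_yx @ np.linalg.solve(s_xx, x - mu_x)
    resp /= resp.sum()  # assumes at least one component has nonzero density
    return resp @ cond_means


# Toy one-component model with scalar x and y (illustrative values only).
weights = np.array([1.0])
means = np.array([[0.0, 1.0]])
covs = np.array([[[1.0, 0.5], [0.5, 1.0]]])
y_hat = gmr_predict(np.array([2.0]), weights, means, covs, dx=1)
```

The estimate is a responsibility-weighted average of per-component linear predictions. For high-dimensional features, the densities would normally be computed in the log domain to avoid underflow.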

Similar resources

Low-Frequency Bandwidth Extension of Telephone Speech Using Sinusoidal Synthesis and Gaussian Mixture Model

The limited audio bandwidth of narrowband telephone speech degrades the speech quality. This paper proposes a method that extends the bandwidth of telephone speech to the frequency range 0–300 Hz. The lowest harmonics of voiced speech are generated using sinusoidal synthesis. The energy in the extension band is estimated from spectral features using a Gaussian mixture model. The amplitudes and ...

Articulatory Features for Robust Visual Speech Recognition by Ekaterina Saenko

This thesis explores a novel approach to visual speech modeling. Visual speech, or a sequence of images of the speaker's face, is traditionally viewed as a single stream of contiguous units, each corresponding to a phonetic segment. These units are defined heuristically by mapping several visually similar phonemes to one visual phoneme, sometimes referred to as a viseme. However, experimental e...

Automatic speech recognition using articulatory features from subject-independent acoustic-to-articulatory inversion.

An automatic speech recognition approach is presented which uses articulatory features estimated by a subject-independent acoustic-to-articulatory inversion. The inversion allows estimation of articulatory features from any talker's speech acoustics using only an exemplary subject's articulatory-to-acoustic map. Results are reported on a broad class phonetic classification experiment on speech ...

Articulatory features for speech-driven head motion synthesis

This study investigates the use of articulatory features for speech-driven head motion synthesis as opposed to prosody features such as F0 and energy that have been mainly used in the literature. In the proposed approach, multi-stream HMMs are trained jointly on the synchronous streams of speech and head motion data. Articulatory features can be regarded as an intermediate parametrisation of sp...

Investigation of the acoustic features of emotional speech using a physiological articulatory model

Processing emotional speech is an important issue for speech information science, and many studies have addressed it. However, we still have no clear knowledge of which acoustic features are crucial for emotional speech, apart from the fundamental frequency, or of how humans manipulate their speech organs to generate emotional speech. In this study, we investigate the acoustic f...

Publication date: 2011